Principal Component Analysis in the Era of «Omics» Data

Author

  • Louis Noel Gastinel
Abstract

The «omics» era, classically also called the post-genomic era, is the period that follows the first publication of the human genome sequence draft in 2001 (International Human Genome Sequencing Consortium, 2001; Venter et al., 2001). Ten years after that milestone, high-throughput analytical technologies, high-performance computing and major advances in bioinformatics have been applied extensively to solve fundamental molecular biology questions as well as to find clues concerning human diseases (cancers) and aging. The principal «omics», such as Gen-omics, Transcript-omics, Prote-omics and Metabol-omics, are biological disciplines whose main and extremely ambitious objective is to describe as extensively as possible the complete class-specific molecular components of the cell. In the «omics» sciences, the catalog of major cell molecular components (respectively genes; messenger RNAs and small interfering and regulatory RNAs; proteins; and metabolites of living organisms) is recorded qualitatively as well as quantitatively in response to environmental changes or pathological situations. Various research communities working in the «omics» fields, organized in both academic and private institutions, have spent large amounts of effort and money to reach standardization in the different experimental and data processing steps. These «omics»-specific steps include the optimal experimental workflow design, the technology-dependent data acquisition and storage, the pre-processing methods, and the post-processing strategies used to extract some level of relevant biological knowledge from usually large data sets. Just as Perl (Practical Extraction and Report Language) has been recognized to have saved the Human Genome Project (Stein, 1996) by using accurate rules to parse genomic sequence data, other web-driven programming languages and file formats such as XML have also facilitated «omics» data dissemination among scientists and helped rationalize and integrate molecular biology data.

Similar resources

Development of a cell formation heuristic by considering realistic data using principal component analysis and Taguchi’s method

Over the last four decades of research, numerous cell formation algorithms have been developed and tested, yet this research remains of interest to this day. Appropriate formation of manufacturing cells is the first step in designing a cellular manufacturing system. In cellular manufacturing, consideration of manufacturing flexibility and production-related data is vital for cell formation....

Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis

Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...

An Empirical Comparison between Grade of Membership and Principal Component Analysis

It is the purpose of this paper to contribute to the discussion initiated by Wachter about the parallelism between principal component (PC) and a typological grade of membership (GoM) analysis. The author tested empirically the close relationship between both analyses in a low-dimensional framework comprising up to nine dichotomous variables and two typologies. Our contribution to the subject is also...

Feature reduction of hyperspectral images: Discriminant analysis and the first principal component

When the number of training samples is limited, feature reduction plays an important role in classification of hyperspectral images. In this paper, we propose a supervised feature extraction method based on discriminant analysis (DA) which uses the first principal component (PC1) to weight the scatter matrices. The proposed method, called DA-PC1, copes with the small sample size problem and has...

ropls: PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data

4 Hands-on: 4.1 Loading; 4.2 Principal Component Analysis (PCA); 4.3 Partial least-squares: PLS and PLS-DA; 4.4 Orthogonal partial least square...

Publication date: 2012